Modern Pathology — Latest Matching Preprints

1

Unsupervised Tissue Concepts for Explainable Sarcoma Subtype Prediction from H&E

Bisson, T.; Ingram, D.; Singh, S.; Li, A.; Flynn, S.; Wang, W.-L.; Kim, A. E.; Bridge, C. P.; Demicco, E. G.; Sorrentino, A.; Jiang, S.; Hung, Y. P.; Lazar, A. J.; Iafrate, A. J.

2026-05-20 pathology 10.64898/2026.05.15.26353333 medRxiv

Top 0.1%

58.8%

Soft tissue sarcomas are a rare, heterogeneous group of tumors whose diagnosis remains challenging because of overlapping morphology and limited access to sarcoma-specialized pathologists. Although pathology foundation models have shown promise in computational pathology, their clinical translation remains limited by insufficient interpretability, particularly in diagnostically complex settings such as sarcoma diagnosis. Here, we developed and evaluated an H&E-based AI framework for sarcoma subtype classification that focused on explainability. Using the CONCH v1.5 foundation model, we computed embeddings from a tissue microarray cohort of 2,545 cases spanning 19 sarcoma subtypes and trained an attention-based multiple-instance learning model that achieved a balanced accuracy of 77.38% (SD 1.88). To move explainability beyond attention-based localization, we trained a sparse autoencoder on patch-level embeddings to learn 768 recurring visual concepts. Ninety high-activation concepts were reviewed by three senior pathologists and curated into morphologically meaningful and non-meaningful categories, yielding a semantic dictionary of 41 diagnostically relevant tissue concepts. We then trained a linear attention-based model on the 768-concept vectors, which retained much of the performance of the raw embedding-based ABMIL model, achieving a balanced accuracy of 73.74% (SD 1.30). When restricting the linear model to pathologist-curated morphologic concepts only, balanced accuracy further decreased to 67.04% (SD 1.27), suggesting that the residual performance gain in the full concept model was driven by inconsistent, technical, or diagnostically irrelevant concepts. Concept-level explanations of the curated linear attention-based model aligned with known sarcoma morphology, including lipogenic, myxoid, spindle-cell, pleomorphic, vascular, small round blue cell, and matrix-forming patterns, and reproduced patterns of diagnostic overlap observed in human sarcoma pathology.
Together, these results show that H&E-based foundation-model representations capture meaningful diagnostic structure within the known limitations of H&E in sarcoma diagnostics, but that their clinical value depends on whether this structure can be made interpretable to pathologists. Sparse autoencoder-derived concepts can address this critical gap by converting embedding-level signal into recurring morphologic patterns that pathologists can review and name, providing the foundation to link these patterns to subtype predictions. In doing so, this approach turns concept discovery into a practical form of diagnostic explanation, while also revealing where model performance is supported by recognizable histopathology and where it relies on diagnostically irrelevant or inconsistent visual patterns.
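
For readers less familiar with the metric reported in this abstract: balanced accuracy is commonly defined as the unweighted mean of per-class recall, so rare subtypes count as much as common ones. The sketch below is illustrative only. The function name and the toy labels are assumptions, and nothing here reflects the authors' actual evaluation code.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the unweighted mean of per-class recall."""
    total = defaultdict(int)    # number of true cases per class
    correct = defaultdict(int)  # correctly predicted cases per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels (not data from the study).
y_true = ["lipo", "lipo", "lipo", "lipo", "myxoid", "myxoid"]
y_pred = ["lipo", "lipo", "lipo", "myxoid", "myxoid", "lipo"]
score = balanced_accuracy(y_true, y_pred)
```

In this toy case the two per-class recalls are 3/4 and 1/2, so the rarer class pulls the score below plain accuracy (4/6).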

2

Interpretable morphology mapping of peripheral blood leukocytes using annotation-efficient artificial intelligence

Liu, Z.; Castillo, S. P.; Han, X.; Sun, X.; Hu, Z.; Yuan, Y.

2026-05-26 pathology 10.64898/2026.05.22.725537 medRxiv

Top 0.1%

52.6%

Background: Peripheral blood smear (PBS) review is labor-intensive, subjective, and challenging for rare or morphologically heterogeneous cell types in hematologic malignancies. Artificial intelligence (AI) offers a scalable alternative, but broader clinical translation is constrained by annotation burden and limited interpretability. Methods: We developed an interpretable, annotation-efficient AI framework that learns leukocyte morphology through a two-stage process: label-free representation learning to construct a morphological embedding space, followed by supervised fine-tuning for cell type and morphological attribute classification. The model was trained and evaluated on 5,952 PBS images from cancer patients at MD Anderson Cancer Center, including blast cells, and 17,092 images from public sources. Active learning strategies were assessed to improve label efficiency, and interpretability was examined using saliency and embedding visualization. An interactive web application, HemoSight, was developed to support clinical review. Findings: The framework achieved a macro-F1 score of 0·96 for 9-way leukocyte classification on the internal test split and 0·83 on the held-out patient cohort. Active learning substantially reduced annotation requirements, reaching peak performance with only 13·3% of available labels and significantly improving learning efficiency across 8 of 9 cell types. The model generalized to classifying 11 leukocyte morphological attributes with a mean F1 score of 85·8% and revealed structured morphological landscapes. Saliency maps, embedding visualizations, and the HemoSight application enabled transparent morphological inspection of model predictions, supporting confidence in model behavior and feasibility for clinical integration.
Interpretation: Our framework enables scalable, annotation-efficient, and interpretable modeling of leukocyte morphology, supporting the integration of AI-assisted PBS review for hematopathology workflows. Funding: Seed funding from The University of Texas MD Anderson Cancer Center. Research in Context. Evidence before this study: Peripheral blood smear review is essential for diagnosing and monitoring hematologic malignancies, but manual case review is time-consuming and variable, particularly for rare or abnormal leukocyte types. Automated hematology analyzers are widely used to flag abnormal cells; however, they provide limited morphological insight and often require frequent manual correction, especially in cancer settings where disease and treatment alter cell appearance. Previous artificial intelligence approaches for leukocyte classification have shown promise, but most rely on fully supervised learning, require extensive expert annotation, focus on a limited set of cell types, and frequently exclude diagnostically important rare cells such as blasts. Interpretability is inconsistently addressed, and few studies provide tools that allow clinicians to inspect and interpret model outputs within routine workflows. Added value of this study: This study introduces an annotation-efficient framework trained on a large collection of peripheral blood smear images, including cancer patient samples with hematopathologist-verified rare cell types such as blasts. The framework learns leukocyte morphology from unlabeled images and adapts to multiple classification tasks with minimal expert labeling. Performance is evaluated on both internal test splits and a held-out patient cohort to provide a realistic estimate of generalization. Iterative, uncertainty-guided annotation substantially reduces labeling requirements while improving learning efficiency across most leukocyte classes.
Beyond cell-type classification, the framework is extended to 11 clinically relevant morphological attributes and reveals a structured morphological landscape. These capabilities are integrated into a web application, HemoSight, enabling real-time inference and transparent morphological inspection of predictions within hematopathology workflows. Implications of all the available evidence: Advancing artificial intelligence for hematology requires methods that reduce expert labeling demands, provide interpretable outputs, and perform reliably across clinically diverse patient samples. This study shows that learning from largely unlabeled data combined with iterative expert annotation can support scalable and flexible modeling of leukocyte morphology for classification tasks. Integrating quantitative predictions and interactive visualization supports the use of artificial intelligence as an assistive tool for diagnostic peripheral blood smear review, with potential to improve efficiency, consistency, and reviewer confidence.

3

Cytoplasmic staining of T cell receptor components enables efficient assessment of lineage and clonality in surface CD3-negative T cell neoplasms

Wilk, A. J.; Gitana, G.; Oak, J.

2026-06-04 pathology 10.64898/2026.06.02.26354783 medRxiv

Top 0.1%

28.0%

Flow cytometry can establish T cell clonality by detecting a restricted expression pattern of the T cell receptor (TCR) β constant region (TRBC), expressed in association with CD3. However, T cell neoplasms frequently lose surface expression of the CD3/TCR complex, posing a challenge to demonstrating T cell lineage and clonality. To address this challenge, here we present a 12-color flow cytometry panel, called cytoTCR, to characterize cytoplasmic expression of CD3/TCR complex components. We apply cytoTCR to 38 patient specimens with immunophenotypically abnormal T cell populations, demonstrating this approach can efficiently establish T cell lineage and clonality in challenging T cell neoplasms that have lost surface CD3 expression. While we show that natural killer (NK)-lineage neoplasms can express cytoplasmic CD3 at similar levels to T cells, we show that absent expression of cytoplasmic TCR components by mature lymphocytes can help confirm NK cell lineage. We demonstrate that cytoTCR can detect cytoplasmic TRBC-restriction in challenging cases of null-phenotype anaplastic large cell lymphoma, which lack surface expression of pan-T cell antigens. In cases of T-lymphoblastic leukemia, cytoTCR shows that cytoplasmic TRBC expression matches the expected developmental stage of the leukemia. Finally, we use cytoTCR to characterize atypical cCD3-CD7- T cells in a patient with a history of T-lymphoblastic leukemia as well as recent CAR-T therapy, showing that this atypical population is polytypic and represents CAR-T product rather than residual disease. Our study presents a broadly applicable flow cytometric approach to simultaneously assess T cell lineage and clonality in suspected T lineage populations with absent surface CD3 expression.

4

Closing the Paediatric Gap: Adult-Trained AI Generalises Robustly to Paediatric Coeliac Disease Diagnosis

Jaeckle, F.; Gillett, P. M.; Kirkwood, K. J.; Natu, S.; Chan, J. Y. H.; Bateman, A. C.; Arends, M. J.; Soilleux, E. J.

2026-06-05 pathology 10.64898/2026.06.04.26354889 medRxiv

Top 0.1%

23.9%

Background Coeliac disease (CD) diagnosis on duodenal biopsies is limited by interobserver variability. We have previously demonstrated pathologist-level performance with our artificial intelligence (AI) model for the histopathological diagnosis of adult CD, but not in paediatric practice. As paediatric CD screening programmes expand internationally, accurate and scalable diagnostic tools are needed. We investigated whether an AI model trained exclusively on adult whole-slide images (WSIs) can generalise to paediatric CD diagnosis across independent centres. Methods A training and validation dataset of 9,958 WSIs from 8,421 adult patients (961 CD) from five centres was used to develop an ensemble of multiple-instance learning models using features from a foundation model. Testing was performed on 708 consecutive paediatric patients (86 CD) from two centres (Edinburgh and Southampton) not included in training. Model calibration was assessed, and probability outputs were grouped into clinically interpretable categories. Findings In adult cross-validation, the AI model achieved an area under the receiver operating characteristic curve (AUC) of 98.7%, sensitivity of 84.9%, specificity of 99.0%, and negative predictive value (NPV) of 98.1%. On testing (paediatric) datasets, performance remained high (AUC 98.8%, sensitivity 80.2%, specificity 98.4%, NPV 97.3%). Restricting analysis to predictions outside the intermediate-probability range (predicted CD probability <10% or ≥65%; 85.3% of cases) improved sensitivity to 100% and specificity to 98.7%. No misclassifications were observed among high-confidence predictions (<2% or ≥85%; 66.0% of cases). The expected calibration error was 0.03. Performance improved significantly when biopsies from both duodenal sites (bulb [D1] and descending [D2/3]) were considered. Interpretation Our AI model, trained on adult biopsies, generalises to paediatric CD diagnosis across centres and scanner platforms.
Well-calibrated probability outputs provide clinically interpretable measures of diagnostic confidence and could support safe identification of CD-negative biopsies within defined thresholds. These findings demonstrate the feasibility of applying adult-derived AI models in paediatric populations and reinforce the importance of multi-site (D1 & D2) biopsy sampling.
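
The probability thresholds quoted in this abstract (<2%, <10%, ≥65%, ≥85%) can be read as a simple banding rule. The following is a minimal sketch of such a rule, assuming the model emits a predicted CD probability between 0 and 1. The function name and the band labels are hypothetical, and the abstract does not say how the authors implemented or named their categories.

```python
def confidence_band(p_cd: float) -> str:
    """Map a predicted coeliac-disease probability (0..1) to a band.

    Thresholds come from the abstract; band names are hypothetical.
    """
    if not 0.0 <= p_cd <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_cd < 0.02:
        return "high-confidence negative"
    if p_cd < 0.10:
        return "likely negative"
    if p_cd < 0.65:
        return "intermediate"
    if p_cd < 0.85:
        return "likely positive"
    return "high-confidence positive"


# Example probabilities (made up for illustration).
bands = [confidence_band(p) for p in (0.01, 0.05, 0.30, 0.65, 0.85)]
```

Cases falling in the "intermediate" band would correspond to the predictions the abstract excludes when reporting 100% sensitivity.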

5

Interpretable machine learning for coeliac disease diagnosis: quantitative morphometry of duodenal biopsies

Bryant, R.; Romero Diaz, J.; Scott, A. G.; Sagdeo, A. A.; Jenkins, G. Z.; Richardson, R. A.; Chan, J. Y. C.; Arends, M. J.; Soilleux, E. J.; Jaeckle, F.

2026-06-03 pathology 10.64898/2026.06.02.26354731 medRxiv

Top 0.1%

22.9%

Background Coeliac disease affects approximately 1% of the global population and remains substantially underdiagnosed. Histopathological assessment of duodenal biopsies is the diagnostic gold standard but is subject to approximately 20% inter-observer disagreement. While machine learning approaches show promise, most prior work relies on black-box models with limited interpretability, restricting clinical adoption. Methods We present an interpretable pipeline that follows established histopathological criteria by extracting clinically meaningful morphological features from H&E-stained whole-slide images. Five sequential stages perform pre-processing, semantic segmentation of villi, crypts, intraepithelial lymphocytes (IELs) and enterocytes, crypt morphometry, villus length estimation via a novel polyline-based keypoint model, and coeliac disease classification using three quantitative features: IEL-to-enterocyte ratio, villus-to-crypt area ratio, and villus-length-to-crypt-depth ratio. Training and validation used data from four institutions; independent testing used 1,357 WSIs from two further institutions including one with a previously unseen scanner manufacturer, spanning five diagnostic categories: coeliac disease, normal mucosa, chronic inflammation, gastric metaplasia, and gastric heterotopia. Results Semantic segmentation achieved villus and crypt precision and recall of 87-90%. Villus length estimation correlated strongly with expert annotations (Pearson's r=0.85, mean relative error 13.5% post-calibration). All three morphological features significantly separated coeliac disease from all non-coeliac diagnostic groups across internal and external datasets (p<0.01 in all comparisons). On the test set the diagnostic classifier achieved accuracy 94.5%, PPV 92.9%, NPV 94.7%, and AUC 0.982. 
Conclusions This interpretable framework achieves strong multi-centre diagnostic performance while producing quantitative morphological outputs (villus length, crypt depth, and IEL-to-enterocyte ratios) that directly reflect established histopathological criteria, representing a meaningful step towards standardised AI-assisted coeliac disease diagnosis.

6

SortIT - A Tool For Assessing Observer Variability And Creating Ground Truth Image Classification Datasets

Uegami, W.; Bisson, T.; Okoshi, E. N.; Costa da Silva, F. G.; Jiragawasan, C.; Zerbe, N.; Bychkov, A.; Fukuoka, J.

2026-05-29 pathology 10.64898/2026.05.28.728616 medRxiv

Top 0.1%

22.9%

Interobserver variability in pathological assessments is a well-recognized challenge that impacts diagnostic reliability and disease understanding. This variability exists across many subspecialties due to the subjective nature of evaluations. Artificial intelligence (AI) applied to whole slide images has potential to standardize procedures and reduce variability in pathology, but transitioning to these technologies does not guarantee improvement. Establishing reliable ground truth datasets with consensus annotations is crucial for developing robust AI solutions. We introduce SortIT, an open-source web application that facilitates systematic creation and evaluation of ground truth image tile annotations. SortIT enables multiple annotators to independently label tiles, with flexible user permission controls. Annotated data can be exported for statistical analysis of observer variation and for creating ground truth datasets from consensus tiles. We outline protocols using SortIT for several use cases: (1) mitosis segmentation in tumor regions, (2) evaluating AI solutions for prostate cancer grading by comparing to expert consensus, and (3) granuloma classification by annotating discriminative tile-level features. A key strength of SortIT lies in its ease of deployment, making it accessible and usable for a wide range of users. Overall, SortIT provides a valuable tool to establish high-quality ground truth datasets and comprehensively assess observer variability. Critical evaluation of ground truth quality using systematic annotation methodologies is crucial for developing accurate and generalizable diagnostic AI tools. SortIT's open-source nature facilitates community adoption and further development.

7

Whole slide image analysis of the endometrial decidual reaction reveals multiscale perturbations associated with miscarriage

Wright, G.; Rawlings, T. M.; Eastwood, M.; Brighton, P.; Taus Nebot, M.; Estermann, A.; Flett, W. T. M.; Younis, A.; Makwana, K.; Yoshihara, H.; Aplin, J. D.; Kong, C.-S.; Christian, M.; Lucas, E. S.; Muter, J.; Brosens, J. J.; Minhas, F.

2026-05-26 pathology 10.64898/2026.05.22.727262 medRxiv

Top 0.1%

22.3%

The inflammatory decidual reaction renders the cycling endometrium transiently permissive for embryo implantation before transforming it into the decidua, the maternal bed accommodating the fetal placenta during pregnancy. Disruptions in decidual tissue remodeling are linked to miscarriage and other pregnancy disorders. However, endometrial assessment is hampered by a lack of affordable technologies capable of mapping the spatiotemporal dysregulation of this dynamic and complex tissue. Employing a graph neural network on whole slide images of 493 CD56-immunostained endometrial samples, Endometronome was developed as a deep learning tool to spatially track the decidual reaction and provide accurate estimates of marker gene expression. When applied to 2,690 additional biopsies, this model consistently identified morphological correlates of prior miscarriage burden, a proxy for future risk. Further, a morphological signature indicative of metabolic glandular impairment discriminated between clinical miscarriage presentations. These findings illustrate how advanced imaging analysis of routine histology can transform miscarriage prevention strategies.

8

Assessing Foundation Models for Computational Pathology in Endometrial Cancer

Volinsky-Fremond, S.; van den Berg, N.; Barkey Wolf, J.; Schoenpflug, L. A.; Andani, S.; Ortoft, G.; Jobsen, J. J.; Lutgens, L. C.; Powell, M. E.; Mileshkin, L. R.; Mackay, H.; Leary, A.; Razack, R. R.; de Bruyn, M.; de Boer, S. M.; Nout, R. A.; Smit, V. T.; Creutzberg, C. L.; Koelzer, V. H.; Bosse, T.; Horeweg, N.

2026-05-25 pathology 10.64898/2026.05.22.26353897 medRxiv

Top 0.1%

22.2%

Computational pathology leverages deep learning to extract clinically relevant information from digitized tumor slides, predicting histopathological subtypes, molecular alterations, and patient outcomes. Recent pipelines increasingly rely on foundation models trained on large pan-cancer datasets to generate generalizable features. In endometrial cancer (EC), their comparative performance for clinical diagnostic tasks remains unexplored. For the first time, this study evaluates the performance of seven state-of-the-art foundation models across morphological, molecular, and prognostic tasks using a large EC dataset of 3,293 patients from randomized trials and clinical cohorts. In addition, their performance was compared to one model (EsVIT) exclusively trained on EC. The foundation models H-OPTIMUS-0, CONCH, and VIRCHOW2, achieved the highest mean performance, but the best-performing foundation model varied by task. The top-performing foundation model outperformed the EC-specific feature extractor EsVIT across all tasks. This study highlights the superiority of foundation models over a domain-specific feature extractor in EC. Selecting the optimal foundation model for novel tasks remains challenging due to performance plateaus and limited information on the training datasets, requiring rigorous benchmarking and domain insight to reach maximum potential.

9

Spatial transcriptomic analysis reveals coordinated gene expression in ovarian clear cell carcinoma and adjacent endometriosis in UK and Japanese patients

Kuroda, T.; Giannone, G.; Ennis, D. P.; Mirza, H. B.; Marks, D.; Flood, L.; Sisley, M.; Griffin, R.; Desai, S.; McDermott, J.; Lambie, N.; Fukasawa, N.; Kiyokawa, T.; Shimoda, M.; Saito, M.; Koba, T.; Saito, R.; Kawabata, A.; Takenaka, M.; Valabrega, G.; Matthews, N.; Tookman, L. A.; Yanaihara, N.; Okamoto, A.; McNeish, I. A.

2026-06-02 pathology 10.64898/2026.05.29.728698 medRxiv

Top 0.1%

18.6%

Purpose: Ovarian clear cell carcinoma (OCCC) is strongly associated with endometriosis and shows geographic variation in incidence. We investigated whether OCCC and adjacent endometriosis exhibit distinct transcriptional states and whether these patterns differ between United Kingdom (UK) and Japanese cohorts. Experimental Design: We performed whole-transcriptome spatial profiling on specimens from 16 OCCC cases (8 UK, 8 Japan) in which tumor and endometriosis were both present. Gene expression was analyzed in tumor, endometriosis and stroma. ARID1A status was assessed by immunohistochemistry. Results: Median age was 59 years (range 26-82). 13/16 cases (81.3%) had early-stage disease. Tissue compartment rather than cohort of origin was the dominant source of variation across endometriosis and tumor regions. Endometriosis was enriched for inflammatory and immune-related pathways compared to tumor, whilst there was greater representation of chromatin and protein-DNA complex assembly pathways in tumor regions. These patterns were conserved across both cohorts and after stratification by ARID1A status. Mesenchymal-associated gene expression scores also significantly differed across stroma, endometriosis and tumor with clear compartmental separation. Cell type deconvolution analyses showed clear compositional differences between stromal and epithelial disease compartments. Conclusions: OCCC and coexisting endometriosis are transcriptionally distinct, with the dominant contrast being compartmental rather than geographic. ARID1A alone is unlikely to account for the principal spatial transcriptional states identified here. Further analyses will be required to ascertain whether these differences reflect genuine biological differences between OCCC and coexisting endometriosis or represent different stages of endometriosis-associated tumorigenesis.
Translational Relevance: Ovarian clear cell carcinoma often arises in association with endometriosis, yet the biological transition between these lesions remains poorly understood. Using spatial transcriptomics in matched tumor and adjacent endometriosis from Japanese and UK cohorts, we showed that endometriosis is characterized by inflammatory and antigen-presentation features, whereas tumor regions showed chromatin-organization and oncogenic transcriptional states. These patterns were largely maintained irrespective of ARID1A status and geographic background. In addition, spatial deconvolution suggested differences in local immune composition, with tumor regions showing relatively greater neutrophil- and T cell-associated signals. Together, our data suggest that OCCC and coexisting endometriosis share a spatially linked tissue context, but that tumor regions have a distinct transcriptional profile and microenvironment that may be involved in malignant transformation and may inform interpretation of molecular classification in endometriosis-associated OCCC.

10

Foundation model-based tool for automated ulcerative colitis histology scoring demonstrates non-inferiority to pathologists across multiple scoring indices

Tahir, W.; Shamshoian, J.; Tauber, J.; Clinton, L. K.; Griffin, M.; Shah, C.; Singh, G.; Fahy, D.; Sucipto, K.; Brosnan-Cashman, J.; Altepeter, T. A.; Bhattacharya, S.; Crandall, W.; Duan, C.; Gale, J. D.; Gupta, V.; Haarmann, H.; Harpaz, N.; Hooper, A. T.; Horowitz, J.; Hurtado-Lorenzo, A.; Hussaini, B. E.; Jairath, V.; Jones, A.; Kostiuk, B.; Kukreja, A.; Laroux, F. S.; Lissoos, T.; McBride, R. B.; Najdawi, F.; Nayyar, A.; Osterman, M. T.; Panchal, P.; Ruane, D.; Travis, S.; Visvanathan, S.; Wilson, L.; Jayson, C.

2026-06-11 pathology 10.64898/2026.06.09.26355212 medRxiv

Top 0.1%

14.8%

In clinical trials for ulcerative colitis (UC), pathologists assess disease severity through standardized histological indices, including the Geboes Score, Robarts Histopathology Index (RHI), and Nancy Histologic Index (NHI). Despite strong associations with clinical outcomes, histologic scoring suffers from inter- and intra-reader variability, and consensus criteria for histologic remission remain uncertain. Through a consortium approach, we developed an artificial intelligence-based measurement (AIM) tool for scoring histology in UC mucosal biopsies (AIM-HI UC). This model, trained on a large dataset of UC biopsies (N=10,230), utilizes additive multiple instance learning models leveraging PLUTO, a pathology foundation model, that predict each of the Geboes subgrades, from which the Geboes grade-level score, RHI, and NHI can be calculated. Evaluation of this model on a standalone verification set including clinical trial specimens established algorithm non-inferiority and/or superiority relative to standard qualified pathologists through comparison of algorithm-consensus and pathologist-consensus agreement metrics (non-inferior if difference >-0.1, superior if difference >0, inclusive of confidence intervals). AIM-HI UC was determined to be non-inferior to pathologists (N=3) for the prediction of all seven Geboes subgrades, grade-level Geboes, RHI, NHI, histologic improvement (GS<3.1), 2A histologic remission (GS<2A.0), and 2B histologic remission (GS<2B.0). AIM-HI UC was superior to pathologists for several Geboes subgrades (GS 0, GS 1, GS 2B, and GS 5), as well as grade-level Geboes, RHI, and positive percent agreement of 2A histologic remission. The model was shown to be greater than 99% repeatable for all histologic scoring metrics examined. 
Model-derived scores were shown to strongly correlate with canonical histologic features of inflammation, including the proportion of total epithelium that is inflamed (Spearman r=0.83; p<0.01), the proportion of neutrophils localized within crypt epithelium (Spearman r=0.83, p<0.01), and the amount of mucosal area classified as erosion or ulceration (Spearman r=0.80, p<0.01). Overall, these results suggest that AIM-HI UC has the potential to improve consistency of UC histology interpretation, providing a path toward standardization of UC histology scoring in clinical trials.
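
The decision rule stated in this abstract (non-inferior if the difference exceeds -0.1, superior if it exceeds 0, inclusive of confidence intervals) can be sketched as follows. This sketch assumes that "inclusive of confidence intervals" means the lower confidence bound of the algorithm-minus-pathologist agreement difference is the quantity compared with each threshold. That reading, the function name, and the example values are assumptions rather than details confirmed by the abstract.

```python
def classify_agreement(diff_ci_lower: float, margin: float = -0.1) -> str:
    """Classify an agreement difference from its lower confidence bound.

    diff_ci_lower: lower CI bound of (algorithm-consensus agreement
                   minus pathologist-consensus agreement); assumed input.
    margin:        non-inferiority margin quoted in the abstract.
    """
    if diff_ci_lower > 0:
        return "superior"
    if diff_ci_lower > margin:
        return "non-inferior"
    return "non-inferiority not shown"


# Made-up lower bounds for illustration only.
results = [classify_agreement(x) for x in (0.02, -0.05, -0.20)]
```

Under this reading, a metric counts as superior only when the whole interval lies above zero, which is stricter than comparing point estimates.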

11

Artificial intelligence-assisted ganglion cell detection in Hirschsprung's disease: A comparative evaluation of two deep learning approaches

Wang, E.; Grenier, K.; Savadjiev, P.; Poenaru, D. D.

2026-06-12 pathology 10.64898/2026.06.11.26354826 medRxiv

Top 0.1%

14.3%

Background. Definitive diagnosis of Hirschsprung's disease (HD) requires pathological identification of enteric ganglion cells. This process is time-consuming and subject to inter-observer variability. Artificial intelligence (AI) tools have the potential to standardize and accelerate this workflow, but no study has determined which AI approach best serves intraoperative HD pathology diagnostics. Method. This study compared the U-Net and You Only Look Once version 26 (YOLO26) frameworks for ganglion cell detection using a single-centre retrospective dataset of 54 whole-slide images (WSIs) from rectal biopsies. WSIs were tiled into 397,731 image patches (128x128 pixels), further partitioned into training (70%), validation (15%), and testing (15%) sets. Models were evaluated on tile- and patient-level diagnostic metrics and processing latency. Results. The U-Net achieved a tile-level sensitivity of 82.9%, showing no statistically significant difference compared to YOLO26 (79.1%; p = 0.097). However, YOLO26 demonstrated a statistically significant advantage in tile-level specificity (96.1% vs. 93.9%; p < 0.001) and reduced mean inference latency (7.64 ms vs. 11.57 ms/tile). At the patient level, both models achieved 100% diagnostic sensitivity. Despite low patient-level specificity (0.0% U-Net; 11.8% YOLO26), the tissue-level diagnostic burden of false positives was 6.00% for U-Net and 3.50% for YOLO26. Conclusion. The U-Net is preferred when nominal gains in sensitivity are prioritized, while the YOLO26 is an alternative that optimizes efficiency and false positive suppression. Both models serve as robust screening filters to augment the pathologist's workflow and should be selected based on workflow requirements. Prospective validation on larger, multi-centre datasets is required before clinical implementation.

12

Label-free 3D virtual histology of human formalin-fixed paraffin-embedded (FFPE) prostate needle biopsies with propagation-based phase-contrast micro-CT (PBCT)

Sugarman, A. L.; Vanselow, D. J.; Chen, G.; Clark, E.; Parkinson, D.; La Riviere, P.; Silverman, J.; Warrick, J.; Cheng, K. C. C.

2026-06-01 pathology 10.64898/2026.05.28.728215 medRxiv

Top 0.1%

12.3%

For over a century, estimation of clinical outcome from tumor biopsies has been based on histomorphology of 2D tissue slices that represent a small fraction of collected samples. Its power derives from histology's 1) unbiased representation of cell types, 2) subcellular resolution that allows the characterization of health and disease states across cell types, and 3) multi-millimeter fields of view that allow assessment of tumor heterogeneity. Histology's dependence upon physical slices, however, limits assessment of 3-dimensional cellular volumes and tissue architecture. Here, we used propagation-based phase-contrast micro-CT (PBCT) to create 3D histological images of residual formalin-fixed, paraffin-embedded (FFPE) prostate needle biopsies. The resulting isotropic, grey-scale, 0.5 micron voxel matrices were used to explore the potential of 3D virtual histology to distinguish diagnostic categories including benign prostatic tissue and prostatic adenocarcinoma of Gleason patterns 3, 4, and 5. Maximum intensity projections of stacks of digital slices, forming 5-micron "slices", allowed the study of virtual sections corresponding to actual serial H&E-stained sections of tissue cut after micro-CT imaging. Like histology, our PBCT reconstructions allowed us to distinguish between non-infiltrative and undulating glands of benign prostatic tissue, infiltrative round glands of Gleason pattern 3, cribriform structures of Gleason pattern 4, and comedonecrosis of Gleason pattern 5. Unlike histology, micro-CT allowed us to further probe 3D tissue architecture in volumetric context. User-friendly exploration of sample volumes was achieved using a customized Neuroglancer multiplanar and 3D rendering interface. A sparsely trained CycleGAN produced plausible virtual H&E staining from the unstained micro-CT reconstructions.
Unlike tissue-section based histology, micro-CT-based virtual histology yields nondestructive 3D characterization of cancer cell and tissue architecture, including glandular spaces, without the undersampling or cutting artifacts of histology. These findings demonstrate the feasibility of PBCT-based 3D virtual histology of prostate cancer and suggest the exploration of derived quantitative analyses of tumor properties for potential contributions to patient care.
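
The abstract describes building 5-micron virtual sections as maximum intensity projections over stacks of 0.5-micron slices. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `slab_mips`, its parameters, and the nested-list volume layout (`volume[z][y][x]`) are assumptions made for illustration, and a real reconstruction would normally be held in an array library.

```python
def slab_mips(volume, voxel_um=0.5, slab_um=5.0):
    """Collapse consecutive slices into maximum intensity projections.

    volume is a list of 2D slices indexed as volume[z][y][x] (hypothetical
    layout); each returned image covers slab_um of depth.
    """
    n = round(slab_um / voxel_um)  # slices per slab: 10 for 0.5 um voxels
    if n < 1:
        raise ValueError("slab must be at least one voxel thick")
    projections = []
    # Step through whole slabs only; a trailing partial slab is dropped.
    for start in range(0, len(volume) - n + 1, n):
        slab = volume[start:start + n]
        height, width = len(slab[0]), len(slab[0][0])
        # Keep the brightest voxel along z for every (y, x) position.
        projections.append([
            [max(s[y][x] for s in slab) for x in range(width)]
            for y in range(height)
        ])
    return projections


# Toy volume: 20 slices of 2x2 voxels, intensity rising along z at one pixel.
toy_volume = [[[z, 0], [0, 0]] for z in range(20)]
toy_mips = slab_mips(toy_volume)
print(len(toy_mips), toy_mips[0][0][0], toy_mips[1][0][0])
```

With the toy volume, the 20 slices collapse into two projections, each holding the largest intensity seen within its 10-slice slab.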

13

An Interactive Trustworthy AI Pathology Copilot to Improve Biomarker-Driven Prognostic Stratification and Therapeutic Response Prediction

Mao, Y.; Xie, C.; Li, F.; Li, D.; Zhang, W.; Zhang, Y.; Li, B.; Zhao, C.; Zhang, Z.; Tan, Y.; Cen, Z.; Tao, H.; Yang, J.; Wang, J.; Feng, Q.; Liu, B.; Liang, L.; Lu, C.; Zhang, Y.; Ning, Z.

2026-05-19 pathology 10.64898/2026.05.17.26352870 medRxiv

Top 0.1%

10.3%

Predictive assays for precision oncology increasingly rely on multi-scale biomarkers that manifest as morphologic signatures in routine whole-slide images (WSIs). However, most computational pathology models treat biomarker profiling and outcome prediction (i.e., prognostic stratification and therapeutic response) as independent tasks, and lack the interactive and trustworthy capabilities required for clinical translation. Here, we present TEAM, an interactive trustworthy AI pathology copilot that improves biomarker-driven outcome prediction. Pretrained on 55,648 pan-cancer WSIs and 1,750,648 regions of interest (ROIs), comprising 360 million patches, TEAM learns risk-aware embeddings by conditioning on clinical metadata and aligning with a relative risk prior. For trustworthy assessment, TEAM quantifies patch-level data (aleatoric) and model (epistemic) uncertainty, then propagates these estimates to patient-level predictions. In outcome prediction, profiled biomarkers serve as intermediate features to contextualize prognostic and therapeutic estimates. Beyond passive prediction, TEAM integrates vision-language models with agentic orchestration for clinical reasoning, and provides a web-based clinician-in-the-loop interface for interactive prediction refinement. Evaluated across 48 multi-institutional cohorts encompassing 85 benchmarks, TEAM consistently outperforms existing methods across biomarker profiling, prognostic stratification, and therapeutic response prediction, supporting trustworthy AI-assisted decision-making in computational pathology.

14

Understanding Human AI Discrepancy in Breast Cancer TIL Assessment: A Multi-Rater and Perceptual Bias Study

Capar, A.; Aloglu, I.; Aker, F.; Ertano, M.; Mese, Y. E.; Ungor, A.; Yildiz, B. E.

2026-06-04 pathology 10.64898/2026.05.29.26354196 medRxiv

Top 0.1%

6.9%

Objective: Tumor-infiltrating lymphocytes (TILs) in breast cancer are one of the most important indicators of the immune response within the tumor microenvironment. They play a particularly significant prognostic and predictive role in triple-negative and HER2-positive subtypes. However, substantial inter-observer variability has been reported in TIL scoring among pathologists, which limits its reliability in clinical practice. The aim of this study was to evaluate the agreement between artificial intelligence (AI) models and pathologists in TIL scoring and to compare this agreement using different statistical approaches, thereby assessing the potential of AI integration into pathology practice. Materials and Methods: Digitized histopathological images of breast cancer cases were included in the study. Tumor regions annotated by pathologists were evaluated for both stromal TIL percentage and the proportion of stromal tumor area within each ROI, with assessments performed independently by three pathologists and two AI models. Agreement was assessed among pathologists, between pathologists and AI, and between AI models. Statistical analyses included intraclass correlation coefficient (ICC), Cohen's and Fleiss' kappa, correlation tests, and Bland-Altman analysis. In addition, categorical agreement was examined using different cut-off values. Results: Inter-pathologist agreement was high, with an ICC of 0.81. In contrast, the global agreement between pathologists and AI models was lower (ICC 0.41). Pairwise comparisons of pathologist-AI agreement yielded substantially lower ICC values (0.12-0.21), although this improved to 0.53 when three pathologists were assessed jointly with a single AI model. The strongest categorical agreement was observed with dichotomized TIL scores (≤10% vs. >10%), whereas multi-category classifications were associated with a marked reduction in kappa values. Spearman correlation coefficients between pathologists and AI models ranged from moderate to good (ρ = 0.48-0.81). Agreement between the two AI models themselves was moderate, with an ICC of 0.64.
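
The categorical comparison in this abstract (TIL scores dichotomized at 10%, then compared with kappa) can be illustrated with a small sketch. This is not the study's code: the function names, the six example scores, and the use of plain two-rater Cohen's kappa are assumptions chosen to show the arithmetic.

```python
from collections import Counter


def dichotomize(scores, cutoff=10.0):
    """Map TIL percentages to 0 (<= cutoff) or 1 (> cutoff)."""
    return [int(s > cutoff) for s in scores]


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up TIL percentages for six regions (illustrative only).
pathologist = [5, 20, 8, 40, 10, 60]
ai_model = [12, 25, 5, 35, 9, 50]
kappa = cohen_kappa(dichotomize(pathologist), dichotomize(ai_model))
print(round(kappa, 3))
```

In this toy data the raters disagree on one of six regions, so observed agreement is 5/6 against a chance agreement of 0.5, which gives a kappa of about 0.667.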

15

DigitAb: Domain-Adaptive Cell Type Prediction Method from Light Microscopy Images

Lucarelli, N.; Winfree, S.; Sabo, A.; Barwinska, D.; Ferkowicz, M.; Bowen, W.; Singh, A.; Chen, K.; Tatke, A.; Jen, K.-Y.; Eadon, M. T.; El-Achkar, T. M.; Jain, S.; Sarder, P.

2026-05-21 pathology 10.64898/2026.05.19.726313 medRxiv

Top 0.1%

6.4%

Light microscopy imaging with histological stains is central to disease diagnosis and research. It is enhanced with immunostaining to reveal cellular composition and complexity linked to clinical utility and biological mechanisms. Emerging multiplex imaging technologies like Phenocycler markedly increase the coverage to capture the cellular diversity but are costly, technically demanding, and inaccessible to most clinical laboratories. We developed DigitAb, a deep learning framework that classifies cell types directly from hematoxylin and eosin (H&E) stained slides, eliminating the need for specialized assays. Using Phenocycler imaging, we generated high-resolution ground truths for ~3.5 million cells from 29 human kidney samples across four multi-institutional datasets to train a semantic segmentation model for 10 cell types, achieving a balanced accuracy of 0.78. By employing an integrated adversarial domain adaptation module, we tested DigitAb on unlabeled and untested biopsy samples from kidney transplant and diabetic samples. We were able to predict several cell types just from histology images, without using any special technology or immunostains, and demonstrate high concordance with clinical gold-standard Banff schema in kidney transplant rejection, and clinical characteristics of diabetic nephropathy. Our cloud-based tool, DigitAb, provides scalable, accessible, label-free cellular segmentation for research and clinical pathology.

16

Bridging Cotyledon Pathology and Perfusion in Healthy Primate Pregnancy

Keding, L. T.; Liu, R.-Y.; Keding, T. J.; Vazquez, J.; Bockoven, C. G.; Shah, D. M.; Golos, T. G.; Wieben, O.; Stanic, A. K.

2026-05-21 pathology 10.64898/2026.05.18.726079 medRxiv

Top 0.1%

5.0%

Introduction: Healthy and diseased placentae alike often display some degree of pathology. However, quantitative techniques to characterize common pathologies and their relationship to local maternal hemodynamics in healthy primate placentae are currently limited. Methods: Placentae from seven rhesus macaques were imaged by MRI at three time points across mid- to late-gestation, to quantify placental blood volume, flow, and perfusion from maternal spiral arteries across pregnancy. Near term, we collected placental cotyledons, digitized hematoxylin/eosin-stained slides, then segmented and annotated sub-tissues and major pathologies (intervillous gaps, fibrin deposition, villous agglutination, inflammatory agglutination, and stromal mineralization) within each cotyledon. Individual pathologies were assessed in relation to each other and MRI perfusion metrics, in a cotyledon-specific manner. Parallel analyses were performed to investigate both basic (Spearman correlation) and animal variance-negated (dimensionality-reduction) relationships. Results: Cotyledons with increased stromal mineralization demonstrated low blood perfusion across pregnancy, alongside significant compensatory changes. Mineralization was further associated with decreased fetal weight, across all sub-tissues. Dimensionality reduction revealed maternal vascular malperfusion-associated pathologies as the largest contributor to dataset variance. Additionally, pathologies commonly associated with healthy placental function demonstrated low cotyledon blood flow and volume at all timepoints, with no evidence of compensatory changes across gestation. Conclusions: Comprehensive digital annotation revealed several relationships connecting pathology and maternal blood perfusion in the healthy primate pregnancy, at the smallest functional unit of the placenta. This methodological framework embeds pathologist-refined morphological expertise into a quantitative, spatially resolved format that can ground, rather than be replaced by, unsupervised computational approaches to placental analysis.

17

Immunohistochemical phenotype is associated with metastatic site in breast cancer: a retrospective pathomorphological study of women from the Lower Aral Sea region, Uzbekistan

Khodjaniyazov, A. A.; Rojobov, R. R.

2026-06-08 pathology 10.64898/2026.06.05.26354969 medRxiv

Top 0.2%

2.6%

Background: Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer death in women worldwide, and the great majority of these deaths are caused by metastatic disease. Whether the immunohistochemical (IHC) phenotype of breast cancer is associated with the anatomical site of metastasis has been characterized mainly in high-income, registry-based populations, while data from ecologically stressed and medically under-served regions such as the Lower Aral Sea basin are lacking. Methods: We retrospectively reviewed 652 women diagnosed with breast cancer at the Khorezm Branch of the Republican Specialized Scientific-Practical Medical Center of Oncology and Radiology (Uzbekistan) between 2020 and 2024, of whom 213 had metastatic disease (306 metastatic foci). Histological type was assessed on hematoxylin-eosin and van Gieson-stained sections; quantitative morphometry was performed in Fiji/ImageJ; and HER2, estrogen receptor (ER), progesterone receptor (PR) and Ki-67 were assessed by IHC. The association between marker expression and metastatic site (liver, lung, lymph node) was tested in 187 foci with adequate tissue using the chi-square test, with significance at p < 0.05. Results: Invasive ductal carcinoma predominated. Metastatic site was significantly associated with the IHC phenotype. Liver metastases showed the highest frequency of HER2 3+ (45.7%), ER-negativity (65.2%), PR-negativity (69.6%) and high proliferation (Ki-67 ≥ 60%; 47.8%), whereas lymph-node metastases were more often hormone-receptor-positive (ER+ 58.7%; PR+ 52.4%) with lower HER2 3+ (22.2%); lung metastases were intermediate (all p < 0.05). The combination of HER2 3+ and Ki-67 ≥ 60% was associated with multi-organ spread. Morphometry corroborated these patterns: liver lesions had larger atypical cells (up to 132.8 µm), a higher nuclear-to-cytoplasmic ratio (0.76 vs 0.51) and more extensive necrosis and microvascularity than lymph-node lesions. A pragmatic 5-criterion morphological score (histological type, Ki-67, HER2, ER/PR status, atypical-cell size) stratified metastatic risk into three tiers. Conclusions: In this regional cohort, the IHC phenotype of breast cancer tracked the anatomical site of metastasis, with an aggressive HER2-driven, hormone-receptor-negative profile concentrated in liver metastases and a hormone-receptor-positive profile in lymph-node metastases. These findings reproduce established organotropism patterns in a previously uncharacterized population and support phenotype-aware, site-specific surveillance together with a low-cost morphological risk score for resource-limited settings.

18

SchistoTrackVideoNet: multilabel deep learning-based classification of schistosomal periportal fibrosis from ultrasound video

Ockenden, E. S.; Anguajibi, V.; Mpooya, S.; Ntegeka, B.; Mugume, T.; Nabatte, B.; Kabatereine, N. B.; Noble, A.; Chami, G. F.

2026-06-02 infectious diseases 10.64898/2026.06.01.26354613 medRxiv

Top 0.2%

1.9%

Schistosomiasis causes a complex, difficult-to-diagnose form of liver fibrosis with high rates of life-threatening morbidity in resource-poor settings where there are often no trained sonographers. Protocols for diagnosis of schistosomiasis-related liver fibrosis have focused on difficult-to-acquire and subjective ultrasound images dependent on extensive expertise. Here we present SchistoTrackVideoNet, the first deep learning-based video model trained on easy-to-acquire standardised ultrasound video sweeps for classification of schistosomiasis-related liver fibrosis. This video-based classification model was trained and evaluated on video sweeps from 2140 participants aged 5-87 years from three districts in rural Uganda. We tested the model at a clinically relevant sensitivity threshold (≥90%) and achieved positive predictive values of 0.0968-0.5556 for diverse presentations of liver fibrosis. Our findings show potential for the use of easy-to-acquire video sweeps for diagnosis of schistosomiasis-related liver fibrosis and our model provides a proof-of-concept for deep learning applied to liver ultrasound video for diagnosis of schistosomiasis-related liver morbidity.
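
Reporting positive predictive value at a fixed sensitivity means first choosing the decision threshold that reaches the sensitivity target, then reading off PPV at that threshold. The sketch below shows one way this could be done; the function name, the threshold search over observed scores, and the example scores are all assumptions for illustration and do not come from the preprint.

```python
def ppv_at_sensitivity(scores, labels, min_sensitivity=0.90):
    """Return (threshold, sensitivity, ppv) at the strictest threshold
    whose sensitivity reaches min_sensitivity.

    scores are model outputs (higher means more likely positive) and
    labels are 0/1 ground truth; a case is called positive when its
    score is >= threshold.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("at least one positive case is required")
    # Try thresholds from strictest to most permissive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        sensitivity = tp / positives
        if sensitivity >= min_sensitivity:
            return threshold, sensitivity, tp / (tp + fp)
    raise ValueError("sensitivity target cannot be reached")


# Made-up scores: ten positive cases followed by ten negative cases.
example_scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.2,
                  0.58, 0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.15, 0.1, 0.05]
example_labels = [1] * 10 + [0] * 10
threshold, sensitivity, ppv = ppv_at_sensitivity(example_scores, example_labels)
print(threshold, sensitivity, ppv)
```

In the toy data, nine of ten positives score at least 0.55 and one negative also clears that threshold, so PPV is 9/10 at 90% sensitivity. With a rarer positive class the same sensitivity target would generally yield a lower PPV.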

19

Development and validation of a digital pathology artificial intelligence (DPAI)-based biomarker predicting risk of Gleason grade group reclassification for patients who are candidates for active surveillance

Mabey, B.; Lenz, L. H.; Schiewer, M. J.; Rayford, W.; Muhammad, H.; Huang, W.; Finch, R.; Nakamoto, C.; Kouros-Mehr, H.; Jasper, J.; Basu, H.; Feng, C.; Sharma, A.; Wilding, G.; Roy, R.; Muzzey, D.; Gutin, A.

2026-05-20 oncology 10.64898/2026.05.15.26353328 medRxiv

Top 0.2%

1.9%

Aims: Active surveillance (AS) allows selected men with localized prostate cancer to defer curative therapy and reduce treatment morbidity. Conversion from AS to treatment is commonly triggered by Gleason grade group (GGG) upgrading on confirmatory biopsy. We developed and validated a digital pathology artificial intelligence (DPAI) biomarker to predict GGG upgrading in AS-eligible patients. Materials & Methods: The DPAI model was trained using histopathology image features from diagnostic biopsies of 998 patients and validated in an independent cohort of 296 patients meeting criteria for AS. Logistic regression estimated the probability of confirmatory-biopsy GGG increase, and feature selection identified the most predictive variables. Results: AI-GUR (Artificial Intelligence-Gleason Upgrade Risk) predicted GGG reclassification at confirmatory biopsy (OR 1.60; p=0.0003), and provided information beyond conventional stratification (risk group, CAPRA) and cribriform morphology (all p<0.01). Predicted risks were similar across time from diagnosis (~10-15% to ~85% at 1, 1.5, or 2 years; p for time=0.50), consistent with initial biopsy mischaracterization rather than time-dependent progression. Conclusions: AI-GUR provides individualized estimates of confirmatory-biopsy GGG upgrading for AS candidates. Using DPAI may improve shared decision-making by complementing standard clinicopathologic tools and molecular testing using the same biopsy specimen, while informing the likelihood of grade upgrade at confirmation.

20

Hepatocyte TEAD1 drives epithelial-stromal remodeling during cholestatic liver injury

Kumar, A.; Lee, J.; Negi, V.; Mandi, V.; Filingeri, D.; Danvers, J.; Pant, R.; Ghosh, S.; Moulik, M.; Yechoor, V.

2026-05-26 pathology 10.64898/2026.05.21.726939 medRxiv

Top 0.2%

1.8%

Background & Aims: Primary sclerosing cholangitis (PSC) is a progressive cholangiopathy characterized by ductular remodeling, inflammation, and periportal fibrosis, for which effective medical therapies remain limited. The Hippo pathway effector TEAD1 has been implicated in liver regeneration and fibrogenesis; however, its role in cholestatic injury remains poorly defined. We investigated whether hepatocyte TEAD1 regulates injury-associated remodeling in a PSC-mimicking model and whether this mechanism is conserved in human PSC liver. Methods: Hepatocyte-specific TEAD1 knockout mice (Alb-TEAD1-/-) and littermate controls were subjected to DDC-induced cholestatic injury. Ductular reaction, fibrosis, inflammation, and bile acid-related gene programs were assessed by histology, immunostaining, and gene expression analyses. Translational relevance was evaluated using bulk and single-cell transcriptomic datasets from human PSC liver. Results: Hepatocyte TEAD1 deletion attenuated DDC-induced fibrosis, ductular expansion, and inflammatory cell accumulation, while preserving hepatocyte proliferative responses. TEAD1-deficient livers exhibited reduced expression of profibrotic mediators, including Spp1, Ctgf, and Cyr61, with decreased extracellular matrix deposition. In contrast, canonical transcriptional adaptations to cholestatic stress, including suppression of bile acid uptake, induction of efflux pathways, and repression of bile acid synthesis genes, were preserved in the absence of TEAD1. Analysis of human PSC datasets demonstrated coordinated upregulation of TEAD1 and TEAD-associated target genes. Single-cell transcriptomic analysis further revealed hepatocyte-enriched TEAD1 expression and activation of a TEAD1 target gene program across all hepatic zones in PSC, with effect sizes exceeding those observed in non-parenchymal populations. TEAD1 activation was accompanied by co-expression of profibrotic mediators and downregulation of hepatocyte differentiation markers, consistent with a maladaptive hepatocyte state. Conclusions: Hepatocyte TEAD1 drives ductular, inflammatory, and fibrogenic remodeling during cholestatic injury without disrupting bile acid metabolic adaptation. These findings identify TEAD1 as a hepatocyte-intrinsic regulator of epithelial-stromal crosstalk and establish conserved activation of this pathway in human PSC, supporting TEAD-directed signaling as a therapeutic target.